Abstract

Background: The Minimal Eating Observation and Nutrition Form - Version II (MEONF-II) is a recently developed
nursing nutritional screening tool. However, its inter- and intrarater reliability has not been assessed. 

Methods: Inpatients (n = 24; median age, 69 years; 11 women) were assessed by eight nurses (interrater reliability,
two nurses scored each patient independently) using the MEONF-II on two consecutive days (intrarater reliability, 
each patient was scored by the same nurse day 1 and day 2). 

Results: Six patients were at moderate/high undernutrition risk. Inter- and intrarater reliabilities (Gwet's agreement 
coefficient) for the MEONF-II 2-category classification (no/low risk versus moderate/high risk) were 0.93 and 0.81; for 
the 3-category classification (no/low - moderate - high risk) reliabilities (Gwet's weighted agreement coefficient) 
were 0.98 and 0.88; and total score inter- and intrarater reliabilities (intraclass correlation) were 0.92 and 0.84. 

Conclusion: Reliability of MEONF-II nurse assessments among adult hospital inpatients was supported and the tool 
can be used in research and clinical practice. 
Background

The Minimal Eating Observation and Nutrition Form -
Version II (MEONF-II) [1-3] was developed for use
by nurses, as they most often conduct the initial
nutritional screening. Studies have supported the
validity of the MEONF-II [1-3], and its user-friendliness 
is high among registered nurses [1,2] and among nursing 
students [4]. Although interrater reliability among hospital 
nurses' assessments according to its predecessor (the 
MEONF-I) has been supported (weighted Kappa, 0.81) [5], 
inter- and intrarater reliability of assessments using the 
MEONF-II remains to be documented. 

The MEONF-II (Additional file 1) is based on recom- 
mendations for detecting undernutrition risk [6-8] and 
includes assessments of unintentional weight loss, low 
BMI/short calf circumference, and eating difficulties.
The included eating difficulties (food intake, chewing/ 
swallowing, energy/appetite) are based on the Minimal 
Eating Observation Form - Version II (MEOF-II) [9,10]. 
An additional assessment of the presence or absence 
of clinical signs of undernutrition is also included [2]. 
MEONF-II scores range from 0-8 (0-2 = low risk; 3-4 =
moderate risk; ≥5 = high risk for undernutrition)
[3]. MEONF-II has shown acceptable sensitivity (0.61-0.73),
specificity (0.79-0.88) and accuracy (0.68-0.82) compared
to the Mini Nutritional Assessment (MNA, 18-item
version) [1-3]. The aim of this study was to evaluate the
inter- and intrarater reliability of MEONF-II assessments
among adult inpatients.
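The score bands described above can be expressed as a small lookup. This is a minimal sketch and not part of the MEONF-II manual: the function names `meonf2_risk_category` and `meonf2_at_risk` are hypothetical, and the high-risk band is assumed to start at a score of 5, since the scale runs from 0 to 8 and the moderate band covers 3-4.

```python
def meonf2_risk_category(total_score: int) -> str:
    """Map a MEONF-II total score (0-8) to the 3-category classification.

    Assumption: high risk is read as a score of 5 or more.
    """
    if not 0 <= total_score <= 8:
        raise ValueError("MEONF-II total score must be between 0 and 8")
    if total_score <= 2:
        return "no/low"
    if total_score <= 4:
        return "moderate"
    return "high"


def meonf2_at_risk(total_score: int) -> bool:
    """2-category classification: True for moderate/high risk."""
    return meonf2_risk_category(total_score) != "no/low"


if __name__ == "__main__":
    for score in (0, 2, 3, 4, 5, 8):
        print(score, meonf2_risk_category(score), meonf2_at_risk(score))
```

The 2-category outcome is derived from the 3-category one so that the two classifications cannot disagree about where the no/low band ends.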

Methods 

Data were collected at a regional Icelandic hospital [11] 
using a cross-sectional test-retest design. The Guidelines 
for Reporting Reliability and Agreement Studies (GRRAS)
were followed [12]. 

Participants 

Adult inpatients (>18 years old) in four general adult 
hospital wards (surgery, general medicine, rehabilitation 
for young and old people) were eligible for inclusion. 
Fifty-six (76%) of 74 available inpatients gave informed
consent to participate, of whom every second (n = 28)
was included in the study of inter- and intrarater reliability.
The other patients were included for the purpose
of estimating the point prevalence of undernutrition risk
at the hospital (not presented here). Four patients were
not available for intrarater reliability assessment on day
two and were thus excluded. The final sample was 24
patients. For test-retest reliability studies a minimum sample
size of 15 to 20 suffices [13,14].

Data collection 

Two registered nurses at each of the four participat- 
ing wards (n = 8) performed the data collection. Be- 
fore commencing data collection, they received brief 
training (45-60 minutes), including pre-testing of the 
forms. 

Data were collected in 2012. Two nurses on each ward 
assessed the patients according to the MEONF-II, in 
parallel with each other (day 1, interrater reliability). The 
nurses did not communicate during the assessment 
or about their findings afterwards. The nurses in 
each pair agreed upon who should be designated to 
be the "first" and "second nurse" (for the purpose of 
interrater reliability). During day 2 the MEONF-II assess- 
ment was repeated, by the same nurse who made the 
assessment of the same patient on day 1, to assess the 
intrarater reliability. 

MEONF-II height and weight measurements were 
conducted using standard equipment available at the in- 
cluded units, and the patients were observed while eat- 
ing and asked about eating difficulties and unintentional 
weight loss. Data were collected under conditions
as close as possible to daily clinical routine.

Instruments 

The data collection forms, including the MEONF-II and
the manual, were translated into Icelandic and back-translated
into Swedish in collaboration with a professional
translator.

In addition to the MEONF-II, dependence in activities 
of daily living (ADL) was assessed using a modified Katz 
ADL-index [4,15]. It summarises an individual's overall 
performance in six activities: hygiene; dressing and 
undressing; ability to go to the toilet; mobility; ability 
to control bowels and bladder; and food intake. Patients 
were then classified as almost totally dependent (help in
5-6 activities), partly dependent (help in 3-4 activities), or
almost totally independent (help in 2 activities or less) [4].

In addition, three single items regarding fatigue/tiredness,
depression and perceived health were applied. The
fatigue/tiredness and depression items asked whether respondents
had gotten tired without any specific reason
and felt gloomy and depressed, respectively, today/
during the last days (graded as: not at all; yes, a little;
yes, quite a lot; a lot) [4,16,17]. The perceived health
item asked how respondents perceived their health in
comparison with other people of the same age (graded
as: not as good as others; as good as others; better than
others) [4,18].

Analysis 

Comparisons were made between those included in the
study and those not included using the Chi-square and Mann-Whitney
U tests, depending on the level of data. Inter- and intrarater
reliability was analysed with proportion exact agreement (PA),
Kappa statistics (K), quadratic weighted K (Kw), Gwet's
agreement coefficient (AC1), weighted AC1 (AC1w), and
intraclass correlation coefficient (ICC) and their 95% Confidence
Intervals (95% CIs) [19-22]. Since the MEONF-II
generates different outcomes, these statistics were primarily
interpreted in association with the following outcomes:
K and AC1 for the 2-category classification (identifying no/low
risk versus moderate/high risk), Kw and AC1w for the
3-category classification (no/low - moderate - high risk),
and ICC for the total score [23].
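To make the quadratic weighting concrete, the following sketch computes a quadratic weighted kappa from a square cross-tabulation of two raters. It is an illustration only, not the software routine used in the study. The function name `quadratic_weighted_kappa` is hypothetical, and the example counts are an assumption: they are inferred from the day 1 marginals in Table 2 together with the reported interrater PA, because the cross-tabulation itself is not reported.

```python
def quadratic_weighted_kappa(table):
    """Quadratic weighted kappa for a q x q table of counts (q >= 2).

    table[i][j] = number of patients placed in category i by rater 1
    and in category j by rater 2 (categories in ordinal order).
    """
    q = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(q)) for j in range(q)]

    def weight(i, j):
        # Full credit on the diagonal, partial credit that falls off
        # with the squared distance between categories.
        return 1.0 - ((i - j) ** 2) / ((q - 1) ** 2)

    observed = sum(weight(i, j) * table[i][j]
                   for i in range(q) for j in range(q)) / n
    expected = sum(weight(i, j) * row_tot[i] * col_tot[j]
                   for i in range(q) for j in range(q)) / (n * n)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Assumed day 1 cross-tabulation (rows: "first" nurse, columns:
    # "second" nurse; order: no/low, moderate, high). Inferred, not reported.
    day1 = [[18, 0, 0],
            [1, 4, 0],
            [0, 0, 1]]
    print(round(quadratic_weighted_kappa(day1), 2))
```

With the assumed counts the result rounds to 0.93, in line with the interrater Kw for the 3-category classification in Table 3. With only two categories the quadratic weights reduce to 0 and 1, so the function returns the unweighted K.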

PA does not account for chance agreement but is use- 
ful as a complement to other statistics, particularly to K 
when there are low frequencies in some cells. The ad- 
vantage of K is that it accounts for both PA and the 
chance agreement, and K w considers partial agreement 
[24]. However, K statistics can sometimes be low despite
high levels of agreement [23]. In order to facilitate the
interpretation of K we calculated the maximum obtainable
kappa (Kmax) and the prevalence-adjusted bias-adjusted
kappa (PABAK). If the prevalence index is high,
i.e. the prevalence of a positive rating is very high or
low, chance agreement is also high and K is reduced accordingly.
A bias index close to zero indicates that the
disagreement between raters is close to symmetrical.
The higher the bias index is, the higher the K will be.
Thus, PABAK adjusts for both prevalence and bias. Kmax
shows the maximum obtainable kappa for the set of data
used [23]. The AC1 statistic is more robust than K statistics
and has therefore been recommended as an alternative
or complement to K [25]. K and AC1 values
below 0.20 are regarded as poor, 0.21-0.40 as fair, 0.41- 
0.60 as moderate, 0.61-0.80 as substantial and >0.80 as 
almost perfect agreement [25,26]. 
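The relationship between these indices is easiest to see on a single 2 x 2 table. The sketch below uses the usual textbook definitions (PABAK = 2 x PA - 1; prevalence index = |a - d|/n; bias index = |b - c|/n; AC1 with chance agreement 2p(1 - p), where p is the mean proportion of positive ratings). It is not the AgreeStat or MedCalc implementation, the function name `two_by_two_agreement` is hypothetical, and the example counts are an assumption inferred from the day 1 marginals in Table 2 and the reported interrater PA.

```python
def two_by_two_agreement(a, b, c, d):
    """Agreement indices for a 2 x 2 table of counts.

    a: both raters positive      b: only rater 1 positive
    c: only rater 2 positive     d: both raters negative
    """
    n = a + b + c + d
    pa = (a + d) / n                       # proportion exact agreement
    p1 = (a + b) / n                       # rater 1 proportion positive
    p2 = (a + c) / n                       # rater 2 proportion positive

    # Cohen's kappa: chance agreement from each rater's own marginals.
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
    kappa = (pa - pe_kappa) / (1 - pe_kappa)

    # Maximum kappa obtainable with these marginals.
    pa_max = min(p1, p2) + min(1 - p1, 1 - p2)
    kappa_max = (pa_max - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the mean proportion positive.
    p_mean = (p1 + p2) / 2
    pe_ac1 = 2 * p_mean * (1 - p_mean)
    ac1 = (pa - pe_ac1) / (1 - pe_ac1)

    return {
        "PA": pa,
        "kappa": kappa,
        "kappa_max": kappa_max,
        "PABAK": 2 * pa - 1,
        "prevalence_index": abs(a - d) / n,
        "bias_index": abs(b - c) / n,
        "AC1": ac1,
    }


if __name__ == "__main__":
    # Assumed day 1 counts (moderate/high risk = positive); inferred, not reported.
    for name, value in two_by_two_agreement(a=5, b=1, c=0, d=18).items():
        print(f"{name}: {value:.2f}")
```

With the assumed counts the sketch gives K of about 0.88, Kmax of about 0.88 and PABAK of about 0.92, in line with the interrater values in Table 3. The high prevalence index (most patients are rated no/low risk by both nurses) is what inflates the chance agreement used by K; small differences from published coefficients can arise from rounding or software conventions.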

Regarding the ICC we used a two-way mixed model 
(subjects random, ratings fixed, single measurement, 
absolute agreement) [27]. ICC values should preferably exceed
0.8 but values above 0.70 are considered acceptable
[24]. Data were analysed using IBM SPSS Statistics 20, 
MedCalc 12 and AgreeStat 2011.1 for Windows. 
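For readers who want to see what a single-measure, absolute-agreement ICC from a two-way model involves, the sketch below computes it from the ANOVA mean squares: ICC = (MSR - MSE) / (MSR + (k - 1) MSE + k (MSC - MSE) / n), with n subjects and k ratings per subject. This is the standard textbook formula, not the SPSS code used in the study. The function name `icc_absolute_single` is hypothetical and the ratings in the example are illustrative numbers, not study data.

```python
def icc_absolute_single(ratings):
    """Two-way, single-measure, absolute-agreement ICC.

    ratings: list of rows, one per subject, each holding k ratings
    (one per rater or occasion). No missing values are allowed.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # ratings
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Illustrative ratings: 6 subjects, 4 raters (not study data).
    example = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    print(round(icc_absolute_single(example), 2))
```

Because the rater (column) variance appears in the denominator, systematic differences between raters lower this coefficient, which is what distinguishes absolute agreement from consistency.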

Ethical considerations 

The ethical committee of Akureyri Hospital (case 2/2012) 
approved the study and the use of personal data from 
the study was notified to the Data Protection Authority 
(S5594/2012). The study conforms to the provisions
of the Declaration of Helsinki in 1995 (as revised in
Edinburgh 2000) including informed consent and patient
anonymity [28].

Results 

The eight nurses (all females) had a median age of 
51 years (min-max, 40-60 years), had been working as 
nurses for a median of 16 years (7-38 years), and in the 
current hospital for a median of 12 years (4-32 years). 
The nurses were general hospital ward nurses and none 
of them had prior experience with the MEONF-II. 

There were no significant differences between patients 
included in the study and those not (Table 1). 

The prevalence of moderate/high undernutrition risk 
on day 1 was 21-25%. On day 2 the prevalence was 21% 
(Table 2). 

The inter- and intrarater reliabilities (PA/K/AC1) for
the MEONF-II 2-category classification (no/low risk versus
moderate/high risk) were ≥0.81 except for intrarater
K (0.65). Similarly, for the 3-category classification
(no/low - moderate - high risk) inter- and intrarater
reliabilities (PA/Kw/AC1w) were ≥0.83 except for
intrarater Kw (0.62). For the total MEONF-II scores ICC
values were above 0.80 for both inter- and intrarater
reliabilities (Table 3).

Discussion 

This study addressed the inter- and intrarater reliability 
of MEONF-II nurse assessments of adult hospital inpa- 
tients. Our observations suggest that such assessments 
meet generally acceptable criteria for inter- and intrara- 
ter reliability. 

The sample does not appear to differ from other 
hospital samples, as we found an undernutrition rate 
(moderate/high risk) of 21 or 25%. This is similar to 
previously reported rates according to the MEONF-II. In 
an Icelandic study [11] the baseline prevalence of undernutrition
(moderate/high risk) was 25%, and in small, medium-sized
and large Swedish hospitals the corresponding
rates have been 22-34% [29]. This implies that, from a
nutritional perspective, the sample in this study seems to 
be a representative hospital sample. Furthermore, there 
are no apparent reasons to question the representative- 
ness of the participating nurses. 



Table 1 Background data for the sample (n = 24) and comparisons with those not included (n = 32) regarding age, gender and type of ward

                                                      Included,     Not included,   P-value
                                                      n = 24        n = 32
Age                                                                                 0.177 a)
  Median (min-max)                                    69 (33-92)    69 (22-94)
                                                      n (%)         n (%)
Gender, women                                         11 (46)       18 (56)         0.440 b)
Type of ward                                                                        0.710 b)
  Medical                                             4 (17)        8 (25)
  Surgical                                            9 (37)        14 (44)
  Rehabilitation, young                               4 (17)        4 (12)
  Rehabilitation, older                               7 (29)        6 (19)
Common diagnosis categories (can have more than one)
  Lung                                                5 (22)
  Cardiovascular                                      9 (39)
  Gastrointestinal                                    5 (22)
  Orthopaedic                                         8 (35)
ADL categories
  Almost totally independent (max 2 activities)       12 (50)
  Partly dependent (3-4 activities)                   6 (25)
  Almost totally dependent (5-6 activities)           6 (25)
Health compared to others
  Not as good as                                      14 (58)
  As good as                                          10 (42)
  Better than others                                  0
Tired without reason
  Not at all                                          9 (37)
  Yes, a little                                       8 (33)
  Yes, quite a lot                                    3 (12)
  Yes, a lot                                          4 (17)
Low-spiritedness
  Not at all                                          13 (54)
  Yes, a little                                       6 (25)
  Yes, quite a lot                                    4 (17)
  Yes, a lot                                          1 (4)

a) Mann-Whitney U test.
b) Chi-square test.



MEONF-II assessments were found to exhibit generally
good reliabilities after a relatively small amount
of training. The interrater reliability was particularly high
and comparable to the figures found for other nutritional
screening tools. The interrater reliability for MEONF-II
total score (K 0.53) is similar to that found for the Mini
Nutritional Assessment total score (MNA, K 0.51) between
Table 2 Descriptive nutritional risk screening data according to the Minimal Eating Observation and Nutrition Form (MEONF-II)

                                                        "First" nurse assessments,    "Second" nurse assessments,   Nurse assessments,
                                                        day 1 a)                      day 1 a)                      day 2 a)
Unintentional weight loss, n (%)                        4 (17)                        4 (17)                        5 (21)
Low Body Mass Index or short Calf Circumference, n (%)  1 (4)                         2 (8)                         1 (4)
Problem with...
  ...food intake, n (%) b)                              6 (25)                        7 (29)                        6 (25)
  ...swallowing/mouth, n (%) c)                         4 (17)                        2 (8)                         8 (33)
  ...energy/appetite, n (%) d)                          6 (25)                        4 (17)                        6 (25)
Clinical signs indicate risk of undernutrition, n (%)   5 (21)                        4 (17)                        4 (17)
MEONF-II total score e)
  Mean (SD)                                             1.5 (1.8)                     1.2 (1.5)                     1.6 (1.8)
  Median (q1-q3)                                        1.0 (0-3)                     0.5 (0-2)                     1.5 (0-2)
  Minimum-Maximum                                       0-6                           0-5                           0-6
MEONF-II risk category
  No/Low risk, n (%)                                    18 (75)                       19 (79)                       19 (79)
  Moderate risk, n (%)                                  5 (21)                        4 (17)                        2 (8)
  High risk, n (%)                                      1 (4)                         1 (4)                         3 (13)

a) The "first" (n = 4) and "second" nurses (n = 4) conducted parallel but independent assessments of patients on day 1 and repeated their assessments on day 2
(the same nurses for the same patients as on day 1).
b) Sitting position; Manipulating food on the plate; Conveying food to the mouth.
c) Chewing; Coping with food in the mouth; Swallowing.
d) Eats less than % of served food; Lacks energy to complete an entire meal; Poor appetite.
e) Possible score range, 0-8 (8 = worst).



two geriatric clinicians independently assessing 39 in-hospital
patients [30]. Furthermore, a review of the
Subjective Global Assessment (SGA) found that K varied
between 0.34 and 0.88, and was highest with experienced
assessors [31]. In another study of the MNA in long-term
geriatric clinics (two nurses with more than one year's
experience from using the MNA, median time between
assessments 12 days), intrarater reliability was 0.89 (ICC)
for total MNA scores and 0.78 (Kw) for the 3-category
classifications [32]. These somewhat better intrarater reliabilities
compared to those observed here may be explained by
differences in settings (i.e., hospital inpatients might be less
stable than residents in long-term care facilities). However,
in line with the observations by Steenson et al. [31] it



Table 3 Inter- and intrarater reliability of MEONF-II scores (n = 24) as assessed by eight nurses

                        PA (95% CI)         K (95% CI)          AC1 (95% CI)        Kw (95% CI)         AC1w (95% CI)       ICC (95% CI) g)
Interrater reliability
  2 categories a)       0.96 (0.87-1.0)     0.88 (0.65-1.0) c)  0.93 (0.80-1.0)     NA                  NA                  NA
  3 categories b)       0.96 (0.87-1.0)     0.89 (0.66-1.0) d)  NA                  0.93 (0.76-1.0) d)  0.98 (0.95-1.0)     0.93 (0.84-0.97)
  Total score           0.67 (0.47-0.87)    0.53 (0.31-0.75)    NA                  0.91 (0.84-0.99)    0.98 (0.96-1.0)     0.92 (0.80-0.96)
Intrarater reliability
  2 categories a)       0.88 (0.74-1.0)     0.65 (0.26-1.0) e)  0.81 (0.57-1.0)     NA                  NA                  NA
  3 categories b)       0.83 (0.68-0.99)    0.57 (0.21-0.92) f) NA                  0.62 (0.19-1.0) f)  0.88 (0.72-1.0)     0.63 (0.30-0.82)
  Total score           0.67 (0.47-0.87)    0.56 (0.33-0.78)    NA                  0.82 (0.66-0.98)    0.95 (0.91-1.0)     0.84 (0.67-0.93)

a) 2-category classification = no/low vs moderate/high risk.
b) 3-category classification = no/low vs moderate vs high risk.
c) Kmax = 0.88, Prevalence-adjusted bias-adjusted kappa (PABAK) = 0.92.
d) Kmax (weighted or unweighted) = 0.89, PABAK = 0.98.
e) Kmax = 0.88, PABAK = 0.75.
f) Kmax (weighted or unweighted) = 0.89, PABAK = 0.85.
g) Two-way mixed model for single measures, absolute agreement criterion.

PA, Proportion exact agreement; K, Kappa; Kw, weighted Kappa (quadratic weights); ICC, Intraclass Correlation Coefficient; AC1, Gwet's first order agreement
coefficient; AC1w, Gwet's first order agreement coefficient (quadratic weights); NA, not applicable.

Data in boldface indicate the most appropriate statistic in relation to the respective MEONF-II outcomes.
appears more likely to be related to the assessors' long experience
with the tested tool. Additional studies are needed
to examine the effect of training on the reliability of 
MEONF-II scores. 

Arguably, the most important reliability is that of the 
2-category classification of MEONF-II scores as it initially 
might be used for determining if a patient shall receive a 
nutritional intervention and/or a more thorough assess- 
ment. This reliability was found to be high (AC1, 0.93) be- 
tween raters but lower (albeit still acceptable) within 
raters (AC1, 0.81). This may be due to several possible ex- 
planations. There might have been a change in nutritional 
risk between the two time points, due to improvement or 
decline in health status. Differences in total scores between
the two assessments (24 hours apart) affected the nutritional
risk classifications of four patients; two in each direction,
i.e. towards better and worse nutritional risk status,
respectively. The choice of time period between assessments
is primarily a balance between the risk for changes
in the underlying trait and recall effects [27]. A second explanation,
which is also supported by our data, relates
to the observed score distributions. When bias is low,
kappa is lower than when bias is large, and when
prevalence is very high or low the chance agreement increases
and reduces K [23], whereas the AC1 statistic appears
more robust in such situations [25]. The difference
between K and PABAK provides some support for this ex- 
planation. However, the strongest support is seen in the 
differences between K (0.65), AC1 (0.81) and PA (0.88). In- 
deed, when K is low despite high PA it has been suggested 
to instead rely on the more robust AC1 [25]. Our observa- 
tions provide additional support to that view. However, the 
fact remains that the reliability, even according to AC1, was
higher between raters than within raters, which therefore
appears most likely to be related to actual changes in nutritional
risk between the two time points.

According to Kottner et al. [12], reliability values of
0.60-0.80 may be regarded as sufficient for group-level
applications, whereas values of at least 0.90 are required
for individual patient assessments, when important decisions
are based on the assessment. Thus, from this
perspective the MEONF-II performs well at both a
group and (in most instances) individual level and can
be used both for research purposes and in clinical
practice.

Conclusions 

This study provides support for the inter- and intrarater 
reliability of MEONF-II nurse assessments among adult 
hospital inpatients. Somewhat compromised intrarater 
reliability according to K statistics, despite high propor- 
tions of exact agreement, is likely to represent a meth- 
odological artefact. The MEONF-II can be used in a 
reliable way in research and clinical practice. 



Additional file 



Additional file 1: MEONF-II (Minimal Eating Observation and 
Nutrition Form - Version II). 